I have a question, guys. I was trying to SFT a qwen2.5-coder model starting from qwen2.5-instruction-7b. My task is NL2SQL (from a human prompt to SQL code). I used 10k training samples and the SFT went well: I got about a 5% accuracy increase (sft-qwen2.5-coder-7b vs. base qwen2.5-coder-7b). However, I also tried training a thinking-qwen2.5-coder-7b. I first used o1 to generate some CoT data like this:
As you can see in the figure, I use `<|image_pad|>` and `<|video_pad|>` to replace `<think>` and `</think>`. At inference time, I put `<|image_pad|>` right after the prompt and hope the model generates the thinking process and then the answer. A rough sketch of this setup is below.
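To make it concrete, here is roughly how I build one training sample and the inference prompt. The helper names and the example question/CoT/SQL are made up just for illustration; this is only a sketch of the format, not my actual training pipeline:

```python
# Sketch of the data format: <|image_pad|> / <|video_pad|> stand in for
# <think> / </think>, since they are existing special tokens in the
# qwen2.5-coder tokenizer that my text-only task never uses otherwise.

THINK_OPEN = "<|image_pad|>"
THINK_CLOSE = "<|video_pad|>"


def build_training_sample(prompt: str, cot: str, sql: str) -> dict:
    """One SFT sample: the target wraps the o1-generated CoT in the pad tokens,
    followed by the final SQL answer."""
    return {
        "prompt": prompt,
        "response": f"{THINK_OPEN}\n{cot}\n{THINK_CLOSE}\n{sql}",
    }


def build_inference_prompt(prompt: str) -> str:
    """At inference I append the opening pad token so the model starts 'thinking'."""
    return prompt + THINK_OPEN


# Illustrative example (schema and question are invented):
sample = build_training_sample(
    prompt="List the names of customers who placed more than 3 orders.",
    cot="orders links to customers via customer_id; group by customer and count rows.",
    sql="SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name HAVING COUNT(*) > 3;",
)
print(build_inference_prompt(sample["prompt"]))
```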
I trained with 2500 of these "thinking" samples (far fewer than the 10k normal SFT samples) for 50000 steps, and the model's accuracy came out even lower than base qwen2.5-coder-7b. The model does produce a thinking process, but the answers are just not as good as sft-qwen2.5-coder-7b's. I'd really like to know the reason, or what mistakes I made. Thanks, guys, really appreciate it.